Query Ordering Based Top-k Algorithms for Qualitatively Specified Preferences
Authors
Abstract
Preference modelling and management have attracted considerable attention in the areas of Databases, Knowledge Bases and Information Retrieval Systems in recent years. This interest stems from the fact that a rapidly growing class of untrained lay users confronts vast data collections, usually through the Internet, typically without a clear view of either content or structure and often without even a particular object in mind. Rather, they are attempting to discover potentially useful objects, in other words, objects that best suit their preferences. A modern information system should consequently enable users to focus quickly on the k best objects according to their preferences. In this thesis, modelling preferences as binary relations, we introduce efficient algorithms for the evaluation of the top-k objects. Previous related work treated preference expressions as black boxes and relied on exhaustively applying dominance tests among database objects in order to determine the best ones, resulting in quadratic costs. In contrast, we advocate a query ordering based approach. Our key idea is to exploit the semantics of the input preference expression itself, in terms of both the operators and the preferences involved, to define an ordering over those queries whose evaluation is necessary for the retrieval of the top-k objects. We introduce two novel algorithms, LBA and TBA. LBA defines an ordering over queries that are essentially conjunctions of atomic selection conditions, containing all attributes that the user preferences involve. The algorithm ensures that the way and order in which objects are fetched respect user preferences, avoiding any dominance testing and accessing only the top-k objects, each of them only once.
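The query ordering idea can be illustrated with a minimal Python sketch. It is not the thesis's LBA: it assumes each attribute preference is a strict best-to-worst list of values, that preferences are combined Pareto-style, and that an in-memory list scan stands in for an indexed selection query. The function name `top_k_by_query_ordering` and the data layout are hypothetical. Conjunctive queries are evaluated in an order that is a linear extension of dominance, so no object is fetched before an object that dominates it, and no object-to-object dominance test is performed.

```python
from itertools import product


def top_k_by_query_ordering(rows, prefs, k):
    """Return up to k rows, fetched in an order that respects the preferences.

    rows  : list of dicts (hypothetical stand-in for a database table)
    prefs : {attribute: [values listed from most to least preferred]}
    """
    attrs = list(prefs)
    # Each conjunctive query fixes one value per preferred attribute and is
    # identified by the tuple of rank positions of those values (0 = best).
    rank_tuples = product(*(range(len(prefs[a])) for a in attrs))
    # Sorting by total rank gives a linear extension of Pareto dominance:
    # if query q dominates q', then sum(q) < sum(q').
    ordered = sorted(rank_tuples, key=lambda r: (sum(r), r))

    result = []
    for ranks in ordered:
        condition = {a: prefs[a][i] for a, i in zip(attrs, ranks)}
        # Assumption: this scan stands in for an indexed selection query.
        for row in rows:
            if all(row[a] == v for a, v in condition.items()):
                result.append(row)
                if len(result) == k:
                    return result  # stop as soon as k objects are fetched
    return result
```

Because the queries partition the objects by their attribute values, each object is returned by exactly one query, so it is accessed at most once. Objects whose values do not appear in `prefs` are never fetched in this simplified version.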
From a different angle, TBA defines an ordering over queries that are disjunctions of atomic selection conditions over single attributes, and uses appropriate threshold values to signal when object fetching can terminate, ensuring that all remaining objects are worse than those already fetched. Dominance tests are performed only among already retrieved objects. Analytical study and experimental evaluation show that our algorithms outperform existing ones under all problem instances.
Supervisor: Vassilis Christophides, Associate Professor
Ιωάννης Καπανταϊδάκης, Master's Thesis, Department of Computer Science, University of Crete
Similar resources
Computing Immutable Regions for Subspace Top-k Queries
Given a high-dimensional dataset, a top-k query can be used to shortlist the k tuples that best match the user’s preferences. Typically, these preferences regard a subset of the available dimensions (i.e., attributes) whose relative significance is expressed by user-specified weights. Along with the query result, we propose to compute for each involved dimension the maximal deviation to the cor...
Top-k Query Answering in Datalog+/- Ontologies under Subjective Reports (Technical Report)
The use of preferences in query answering, both in traditional databases and in ontology-based data access, has recently received much attention, due to its many real-world applications. In this paper, we tackle the problem of top-k query answering in Datalog+/– ontologies subject to the querying user’s preferences and a collection of (subjective) reports of other users. Here, each report consi...
Evaluation of Conditional Preference Queries
The need for incorporating preference querying in database technology is a very important issue in a variety of applications ranging from e-commerce to personalized search engines. A lot of recent research work has been dedicated to this topic in the artificial intelligence and database fields. Several formalisms allowing preference reasoning and specification have been proposed in the AI domai...
Estimation of Potential Product Using Reverse Top-k Queries
At present, most of the applications return to the user a limited set of ranked results based on the individual user’s preferences, which are commonly validated through top-k queries. From the perspective of a manufacturer, it is imperative that the products appear in the highest ranked positions for many different user preferences, otherwise the product is not visible to the potential customers...
Efficient Evaluation of Numerical Preferences: Top k Queries, Skylines and Multi-objective Retrieval
Query processing in databases and information systems has developed beyond mere SQL-style exact matching of attribute values. Scoring database objects according to numerical user preferences and retrieving only the top k matches or Pareto-optimal result sets (skyline queries) are already common for a variety of applications. Recently a lot of database literature has focussed on how to efficientl...